BACKGROUND OF THE INVENTION



1. Field of the Invention

The present invention relates to methods and apparatus for synchronizing 
and propagating distributed routing databases. The invention also relates to 
methods for distributing routing data within a distributed processor router 
system.

2. Background of the Related Art

In the context of internetworking, routing is the coordinated transfer
of information from a source to a destination via hardware known as a
router. Routing occurs at Layer 3, the network layer, of the OSI reference
model of the ISO (International Organization for Standardization). The OSI
reference model is a conceptual model composed of seven layers, each
specifying particular network functions. The two lowest layers (layers 1 and
2) of the OSI model, namely the physical and data link layers, are
implemented in both hardware and software. Layer 3 and layers upwards
therefrom are generally implemented only in software.

Using terminology of the International Organization for
Standardization (ISO), network devices may be classified as follows. Those
devices with the capability to forward packets between subnetworks are
referred to as intermediate systems (ISs). (In contrast, network devices
without such capabilities are called end systems.) Intermediate systems may
be classified as intradomain ISs, i.e., those which can communicate within
routing domains, and interdomain ISs, which can communicate both within
and between routing domains. A routing domain, or autonomous system,
can be considered to be a part of an internetwork which is regulated under
common administrative authority.

A key component of routing is determination of optimal routing
paths for data packets. Thereafter a second component, which may be
referred to as "forwarding", comprises transporting packets through the
internetwork. Determination of optimal routing paths relies on one or more
routing protocols to provide and update a routing database for each router in
a network. Depending on the particular routing protocol(s) used, various
metrics are involved in building the routing database. Metrics that may be
used by various routing protocols, either singly or as components of hybrid
metrics, include: bandwidth, cost, path length, reliability, and load. Such
metrics are well known in the art.



Routing protocols are used to determine best routes for transporting
packets through an internetwork. Routing in a network can be classified as
either dynamic or static. Static routing is accomplished by using table
mappings which are entered by a user (e.g., a network administrator) prior to
routing, and are only changed by user input. Dynamic routing is
accomplished by routing protocols that adjust to changing network
conditions in response to incoming route update information. As a result,
routes are recalculated, new routing update messages are sent out to peer
routers, and updated routing databases are constructed. Routing protocols
may be interior or exterior. Conventionally, interior routing protocols are
used for determining routes within a routing domain. Examples of interior
routing protocols are Routing Information Protocol (RIP) and Open
Shortest Path First (OSPF). Exterior routing protocols exchange routing
information between routing domains. Examples of exterior routing
protocols are Border Gateway Protocol (BGP) and Exterior Gateway
Protocol (EGP).

OSPF is a unicast routing protocol that requires each router in a
network to be aware of all available links in the network. OSPF calculates
routes from each router running the protocol to all possible destinations in
the network. Intermediate System to Intermediate System (IS-IS) is an OSI
link-state hierarchical routing protocol based on DECnet Phase V routing,
whereby ISs (routers) exchange routing information based on a single
metric, to determine network topology.

BGP performs interdomain routing in TCP/IP networks. As an
exterior gateway protocol (EGP), BGP performs routing between multiple
routing domains and exchanges routing and reachability information with
other BGP systems. Each BGP router maintains a routing database that lists
all feasible paths to a particular network. The router does not refresh the
routing database, however. Instead, routing information received from peer
routers is retained until an incremental update is received. BGP devices
exchange routing information upon initial data exchange and after
incremental updates. When a router first connects to the network, BGP
routers exchange their entire BGP routing tables.

In order to update their routing databases, routers send and receive
information regarding network topology. Examples of such information
include routing update messages and link-state advertisements. By
communicating with other routers in this way, each router obtains a routing
database that defines the current topology of the network of which it is a
part, enabling determination of optimal routing paths.

Entries are added to and removed from the route database either by
the user (e.g., a network administrator) in the form of static routes, or by
various dynamic routing protocol tasks. In dynamic routing, routes are
updated by software running in the router. The routing database defines a
mapping from destination address to logical (output) interface, enabling the
router to forward packets along the best route toward their destination. The
route database is also the principal medium used to share routes among
multiple active routing protocols. Thus, the routing database comprises an
essential entity at the heart of every router.

Typically, two or three routing protocols may be active in any one
router. The routing database as such is a superset of the set of routes
actually used for forwarding packets. This is due, in part, to the fact that
different routing protocols compute their preferred routes independently of
each other, based on different metrics. Only when all route entries generated
by the full complement of routing protocols are shared in the routing
database, or route table, can the best routes be selected. The result of this
selection is a subset of the routing database commonly referred to as the
forwarding table. The forwarding table can be considered a filtered view of
the routing database. The forwarding table is used by all entities of the
router that have to forward packets in and out of the router.



In conventional or prior art non-scalable routers, which have a
modest number of interfaces, there is a single copy of the routing database
shared by all of the routing protocols. In non-scalable routers, the
computational power available to the routing protocols is typically limited to
a single processor. Also, in non-scalable routers, the number of entities
requiring a copy of the forwarding table is relatively small.



In contrast, in routers with a relatively large number of interfaces, a
possibility exists for imposing much higher computational loads on the
processor, up to a point where it is no longer feasible to run all routing
protocols on the same processor. In order to realize improved performance
from such routers, the protocol computational load must be distributed onto
a plurality of processors. Furthermore, in routers with a very large number
of interfaces, the number of entities requiring a copy of the forwarding table
can be very large, for example, numbering several thousands. This latter
situation also imposes higher computational loads and the need for a
plurality of processors per router.

However, running the routing protocols on a plurality of processors,
each processor having a copy of the routing database, introduces a potential
problem into the routing system. The problem is the critical requirement to
keep all copies of the routing database consistent. This requirement is critical
because the view of the routing database presented to the routing protocols
is vital to correct routing. Moreover, the ability to provide an accurate and
timely copy of the forwarding table to a very large number of entities in the
system is necessary in order to leverage the benefits provided by a
distributed routing database environment.



The instant invention provides a method for the distribution and
synchronization of the routing database and forwarding table to a large
number of entities within a distributed processor environment of a scalable
router.

According to one aspect of the invention, there is provided a method
for the synchronized distribution of routing data in a distributed processor
router. The invention allows multiple routing databases, formed by
distributed routing protocols, to be synchronized. The invention further
allows distributed propagation of the synchronized database.

One feature of the invention is that it enables the scaling of routing
protocol tasks instantiated on multiple processors. Another feature of the
invention is that it provides a distributed processor router environment, in
which a plurality of processors host at least one of a plurality of different
routing protocols. Another feature of the invention is that it provides a
route table manager for controlling the propagation of a synchronized
routing database within a distributed processor environment.

One advantage of the invention is that it allows routing databases to
be constructed and propagated in a distributed manner by instantiating
routing protocol tasks on multiple processors. Another advantage of the
invention is that it provides a method for exchanging route data between a
plurality of processors within a distributed processor router environment,
wherein the exchange of route data is controlled by a route table manager
(RTM). Another advantage of the invention is that it provides a method for
registering a first RTM task as a client of a second RTM task in order to
establish a first RTM task-second RTM task client-server relationship,
wherein the first RTM task and the second RTM task occupy either the same
hierarchical level or different hierarchical levels. Another advantage of the
invention is that it provides a method for establishing a first RTM task-second
RTM task client-server relationship, wherein the first RTM task is
running on a line card of a highly scalable router, and the second RTM task
is running on a control card of the same router.

These and other advantages and features are accomplished by the
provision of a method of synchronized distribution of routing data in a
distributed processor router, including the following steps: a) running zero
or more routing protocols of a complement of routing protocols on each of a
first plurality of processors, wherein each routing protocol of the
complement of routing protocols generates routing data; b) registering each
of the first plurality of processors with at least one other of the first plurality
of processors; c) exchanging the routing data between members of the first
plurality of processors, such that each of the first plurality of processors
receives a full complement of routing data generated by the complement of
routing protocols, the complement of routing data providing a complete
routing database; d) forming a forwarding database from the complete
routing database, the forwarding database comprising a subset of the
complete routing database; and e) propagating the forwarding database from
the first plurality of processors to a second plurality of processors, wherein
the second plurality of processors are not running a routing protocol.

These and other advantages and features are accomplished by the
provision of a method of distributing routing data in a distributed processor 
router under the control of a route table manager (RTM), the router running 
a complement of routing protocols and the router having a plurality of 
control cards and a plurality of line cards, wherein this method includes: a) 
running an RTM task on each of the plurality of control cards, and running 
an RTM task on each of the plurality of line cards; b) generating routing data 
on at least one of the plurality of control cards; c) under the control of the 
RTM, distributing at least a portion of the routing data from at least one of 
the plurality of control cards to at least one other of the plurality of control 
cards; and d) again under the control of the RTM, distributing at least a 
portion of the routing data from at least one of the plurality of control cards 
to at least one of the plurality of line cards. 

These and other advantages and features are accomplished by the
provision of a method of registering a route table manager (RTM) task client
with an RTM task server within a distributed processor router, including the
steps of: a) querying, via the RTM, a location service for a first node list,
wherein the location service is a directory listing the location and status of
each task running within the router, and wherein the first node list comprises
a list of prospective RTM task servers; b) sending the first node list from the
location service to a would-be RTM task client; c) selecting, by the would-be
client, a first node from the first node list; d) sending a registration
request from the would-be client to the first node selected in said step c); e)
adding the would-be client as a client of the first node selected, whereby the
first node selected is a server of the client; and f) after step e), sending a
registration response from the server of the client to the client.
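The registration steps a) through f) can be sketched as follows. This is a minimal illustration only: the class names `LocationService` and `RtmTask`, the method names, the task names, and the policy of selecting the first node in the list are all assumptions made for the example and are not taken from the specification.

```python
class LocationService:
    """Directory of running tasks and their status (hypothetical structure)."""

    def __init__(self):
        self._tasks = {}  # task name -> (task object, status)

    def announce(self, task, status="up"):
        self._tasks[task.name] = (task, status)

    def prospective_servers(self, requester):
        # Steps a) and b): return the node list of candidate RTM task servers.
        return [task for task, status in self._tasks.values()
                if status == "up" and task is not requester]


class RtmTask:
    def __init__(self, name):
        self.name = name
        self.clients = []  # names of tasks registered with this task
        self.servers = []  # names of tasks this task is registered with

    def handle_registration_request(self, client):
        # Step e): add the requester as a client; step f): send a response.
        self.clients.append(client.name)
        return {"server": self.name, "accepted": True}

    def register(self, location_service):
        node_list = location_service.prospective_servers(self)
        if not node_list:
            return None
        first_node = node_list[0]  # step c): selection policy is an assumption
        response = first_node.handle_registration_request(self)  # step d)
        if response["accepted"]:
            self.servers.append(response["server"])
        return response


ls = LocationService()
control_card = RtmTask("rtm-control-card")
line_card = RtmTask("rtm-line-card")
ls.announce(control_card)
ls.announce(line_card)
response = line_card.register(ls)
```

In this sketch the line card task ends up as a client of the control card task, mirroring the line card to control card relationship mentioned among the advantages above.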

These and other advantages and features are accomplished by the
provision of a distributed processor router, including: a plurality of control
cards, and a plurality of line cards, the plurality of control cards having a first
plurality of processors, wherein each of the first plurality of processors runs
zero or more routing protocols of a complement of routing protocols, each
routing protocol of the complement of routing protocols generates routing
data, and each of the first plurality of processors registers with at least one
other of the first plurality of processors for exchange of the routing data
between members of the first plurality of processors such that each of the
first plurality of processors receives a full complement of routing data.

These and other advantages and features of the invention will be set forth in
part in the description which follows and in part will become apparent to those
having ordinary skill in the art upon examination of the following, or may be learned
from practice of the invention. The advantages of the invention may be realized and
attained as particularly pointed out in the appended claims.

BRIEF DESCRIPTION OF THE DRAWINGS

Figs. 1A, 1B and 1C are block diagrams showing basic architecture of a
scalable router according to an embodiment of the invention;

Fig. 2 schematically represents exchange of route data generated by
different routing protocols, according to an embodiment of the invention;

Fig. 3 schematically represents exchange of route data generated by
two different routing protocols showing four servers and two clients,
according to an embodiment of the invention;

Fig. 4 schematically represents chronology of RTM-mediated data
flow between two control cards, according to one embodiment of the
invention;

Fig. 5 schematically represents a hierarchical relationship of RTM
tasks according to a preferred embodiment of the invention;

Fig. 6 schematically represents a hierarchical relationship between
route table manager tasks, according to one embodiment of the invention;

Fig. 7A schematically represents the distribution of route data from a
route table manager Level-1 task primary server to a route table manager
Level-2 task client, according to the invention;

Fig. 7B schematically represents the distribution of route data from a
route table manager Level-1 task secondary server to a route table manager
Level-2 task client, according to an embodiment of the invention; and

Fig. 8 schematically represents a series of steps involved in a method
for synchronized distribution of routing data within a distributed processor
router, according to another embodiment of the invention.

ACRONYMS



The following acronyms and abbreviations are used in the description which follows: 

BGP: Border Gateway Protocol

CCB: Client Control Block 

IS-IS: Intermediate System to Intermediate System 

ITC: Inter-task communication

LS: Location Service 

OSPF: Open Shortest Path First 

RTM: route table manager (also referred to as the global route table 
manager (GRTM)). 

DETAILED DESCRIPTION OF THE PREFERRED 
EMBODIMENTS 

In order to place the invention in perspective for the better understanding
thereof, there now follows, with reference to Figs. 1A-1C, a brief description of
a scalable router which may be used in conjunction with the instant invention.
Fig. 1A is a block diagram showing the basic architecture of a router 10. Each
router 10 may include a plurality of shelves, represented in Fig. 1A as 20A to
20N. As shown in Fig. 1B, each shelf 20 can include a plurality of line cards,
represented as 40A to 40N. For the purpose of clarity, only two control cards
are shown in Fig. 1B; however, it is to be understood that in practice larger
numbers of control cards can be used according to the invention. Each control
card 30 is in communication with at least one line card 40. For example, control
card 30A is shown as being in communication with line cards 40A and 40N on
shelf 20A. Again, for the purpose of clarity, only two line cards are shown as
being in communication with control card 30A. However, according to the
invention, larger numbers of line cards may be connected to each control card.

Fig. 1C shows line card 40, which could be any of the line cards from
a shelf of router 10. Line card 40 has a plurality of ports, or exterior
interfaces, 50A, 50B, through 50N connected thereto. Although only three
interfaces are depicted in Fig. 1C, it is to be understood that a much larger
number of interfaces may be used in practice.

Introduction to a Route Table Manager 

A route table manager (RTM) of the instant invention is a
multifaceted software suite having a plurality of functions (tasks) that
include, but are not necessarily limited to, the following:

1. messaging between RTM task servers and RTM task clients to
form scalable and fault tolerant distribution topologies;

2. managing exchange of database information between RTM tasks
running on separate processors within a distributed processor environment;

3. constructing a routing database from the sum of database
information a) generated locally by tasks running on a local processor, and b)
generated by and received from tasks running on at least one remote
processor;

4. constructing a forwarding database from the routing database; and

5. propagating the forwarding database from RTM tasks having a
higher hierarchical level (Level-1 tasks) to RTM tasks having a lower
hierarchical level (Level-2 and lower-level tasks).

In a distributed multi-processor router, such as is encountered
according to certain aspects of the instant invention, the RTM distributes
information on dynamic routes, static routes, and interface information,
hereafter referred to as database information. In return, the RTM relies on a
number of tasks (services) for database information updates. Such tasks
include those of dynamic routing protocols, IP, and an interface manager
task. Routing protocols provide the RTM with updates on dynamic routes.
IP tasks provide the RTM with updates on static routes. The interface
manager task manages the ports, or external interfaces, of the router system,
and provides the RTM with interface information. Interface information
relates to a specific interface from which to dispatch a particular packet.
Interface information, in general, is well known in the art.

The sum of the database information provided by services is
collectively referred to as the routing database. Route entries maintained in
the routing database include best and non-best routes. For example, all route
entries that were injected by different routing protocols of the system's
complement of routing protocols are stored in the routing database.
However, for a plurality of entries having the same destination prefix, only
one of the entries is deemed the best. The decision as to which of those is the
best entry (i.e., the best route for forwarding a packet) is based on a
pre-configured preference value assigned to each routing protocol. For example,
if static routes have a high preference value and IS-IS routes have a low
preference value, and a route entry having the same destination prefix was
injected by each protocol, although both entries will remain in the routing
database, the static route is considered to be the best route. In embodiments
of the invention, both the best routes and the non-best routes, as well as
interface information, are retained in the routing database. A subset of the
routing database exists which is referred to as the forwarding table. The
forwarding table contains all route entries that are deemed the best plus all
interface information. Therefore, according to the invention, both the best
routes and the interface information define the forwarding table.
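The selection of best routes by per-protocol preference can be sketched as follows. The numeric preference values, the dictionary layout of a route entry, and the function name `build_forwarding_table` are assumptions chosen for illustration; the specification states only that each routing protocol has a pre-configured preference value and that the forwarding table holds the best routes plus all interface information.

```python
from collections import defaultdict

# Hypothetical preference values: a higher value wins, matching the
# static-versus-IS-IS example in the text.
PREFERENCE = {"static": 200, "ospf": 110, "isis": 100}


def build_forwarding_table(routing_db, interfaces):
    """Keep the best route per destination prefix plus all interface information."""
    by_prefix = defaultdict(list)
    for route in routing_db:
        by_prefix[route["prefix"]].append(route)
    best = {
        prefix: max(routes, key=lambda r: PREFERENCE[r["protocol"]])
        for prefix, routes in by_prefix.items()
    }
    return {"routes": best, "interfaces": list(interfaces)}


# The routing database retains both best and non-best entries.
routing_db = [
    {"prefix": "10.0.0.0/8", "protocol": "static", "next_hop": "if0"},
    {"prefix": "10.0.0.0/8", "protocol": "isis", "next_hop": "if1"},
    {"prefix": "192.168.0.0/16", "protocol": "ospf", "next_hop": "if2"},
]
forwarding_table = build_forwarding_table(routing_db, ["if0", "if1", "if2"])
```

Here the routing database still holds three entries, while the forwarding table holds one entry per prefix, with the static route chosen for 10.0.0.0/8.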

A task of the RTM software suite typically runs on each of the
plurality of processors of a multi-processor scalable system, including
processors on control cards and line cards. The RTM task executing on each
processor can be classified as either a Level-1 RTM task (L1) or a Level-2
RTM task (L2), and the processor may be termed an L1 or an L2 as a result.
The distinction between an L1 and an L2 is in general the presence of either
a routing database or a forwarding table. An L1 RTM task maintains the
routing database and an L2 RTM task maintains the forwarding table. A
subset of the plurality of processors of the system is statically configured to
host an L1 RTM task and is referred to as the L1 pool. All other processors
of the system outside of the L1 pool host an L2 RTM task.

As previously described, the RTM depends on a number of services
for updates in routing information. A processor within the L1 pool may be
running a number of such services, or none at all. Examples of such services
include the IP routing protocols, OSPF, BGP, integrated IS-IS, etc. (See, for
example, C. Huitema, Routing in the Internet, 2nd Edition, Prentice Hall
PTR, 2000.) According to the invention, each L1 is responsible for
constructing a routing database from information generated in part by the
local service(s), and in part from information generated by services running
in association with other L1s. To obtain information that is generated by
non-local services, i.e., information generated by services running on other
L1s, an L1 must register with at least one other L1 where the service is
running. According to the invention, in order to efficiently exchange
locally generated information between L1s, each L1 can register with at
least one other L1 as needed, on a per-service basis, to receive updates on
the full complement of route data which is generated non-locally.

L1s register with each other for distribution of the following types of
database information: dynamic routes including best and non-best routes,
static routes including best and non-best routes, and interface information.
An L1 is classified as an L1 server or L1 client for a given type of database
information, depending on the existence of local services. An L1 task is an
L1 server for a particular type of database information if the service which
generates that information is running locally. An L1 task is an L1 client for a
particular type of database information if the service which generates that
information is not running locally and the L1 task has registered with an L1
server for information of this type. For example, if a BGP task is running
on a given processor, the L1 task on that processor is considered an L1
server for BGP route information. If the same L1 task has registered with a
remote L1 task for OSPF route information, the former L1 task is
considered an L1 client of the remote L1 task with regard to OSPF route
information.
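The server/client classification rule for a Level-1 task can be expressed as a short sketch. The function name `l1_role`, the lowercase service labels, and the name `remote-l1` are hypothetical and serve only to restate the BGP/OSPF example in executable form.

```python
def l1_role(local_services, registered_with, info_type):
    """Return 'server', 'client', or None for one type of database information.

    local_services: set of services running on the same processor.
    registered_with: mapping of information type -> remote Level-1 server.
    """
    if info_type in local_services:
        return "server"   # the generating service runs locally
    if info_type in registered_with:
        return "client"   # registered with a remote Level-1 server
    return None           # neither generated locally nor registered for


# A Level-1 task with a local BGP task that has registered with a
# remote Level-1 task for OSPF route information.
local_services = {"bgp"}
registered_with = {"ospf": "remote-l1"}
roles = {t: l1_role(local_services, registered_with, t)
         for t in ("bgp", "ospf", "static")}
```

The same task is thus a server for one type of information and a client for another at the same time.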

Fig. 2 schematically represents exchange of route data, generated by
different routing protocols, between a plurality of control cards 30A, 30B, and
30N within a distributed processor, scalable router, according to one
embodiment of the invention. As alluded to hereinabove, the inventors have
determined that superior performance from a scalable router is attained when
routing protocols are distributed among control cards of the router. That is,
superior performance is attained by running a plurality of different routing
protocols on a plurality of processors within the control plane (on control cards)
within the router. According to one embodiment, each of the plurality of
processors is situated on a different control card of the router. With reference to
Fig. 2, the plurality of control cards is represented by control cards 30A, 30B,
and 30N. In the example shown in Fig. 2 a service or routing protocol task runs
on each control card 30A, 30B, 30N. Therefore, according to the definitions
presented hereinabove, a Level-1 task (L1) of the RTM is running on each
processor. In particular, according to the example shown in Fig. 2, control
cards 30A, 30B, 30N run routing protocol A, routing protocol B, and routing
protocol N, respectively. Routing protocol A, routing protocol B, and routing
protocol N provide route data A, route data B, and route data N, respectively.
As described hereinabove, the L1 for each control card requires route data from
the full complement of routing protocols running on the plurality of control
cards 30A, 30B, and 30N. L1s therefore exchange route data by registering
with other L1s on a per-service basis.

Fig. 3 schematically represents exchange of route data generated by two
different routing protocols showing four servers and two clients, according to an
embodiment of the invention. This aspect of the instant invention relates to the
registration of L1s with at least one other L1, on a per-service basis, for the
facile exchange of non-locally generated route data. Each entity I-IV represents
an L1 task: L1A, L1A', L1B, and L1B', respectively. For the purpose of this
example, the routing protocol tasks are designated as routing protocol A (RPA)
in the case of L1A and L1A', and routing protocol B (RPB) in the case of L1B
and L1B'. Under the control of the RTM, L1A registers as a client with both
L1B and L1B' for information generated by routing protocol B, wherein both
L1B and L1B' are servers. Similarly, L1B' registers as a client with both L1A
and L1A' for information generated by routing protocol A, wherein both L1A
and L1A' are servers. Thus, the same entity may have both client and server
functionality concurrently. For the sake of clarity, L1A' and L1B are not shown
as clients, but as servers only, therefore sending, rather than receiving,
information.

In the arrangement shown in Fig. 3, L1A is registered with both L1B and
L1B', which both run RPB, and L1B' is registered with both L1A and L1A',
which both run RPA. This redundancy in preferred embodiments of the
invention provides fault tolerance against the possibility of failure of one or
more L1 servers. Fault tolerance in the system is further described in a section
below entitled Fault Tolerance.
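The effect of redundant registration can be illustrated with a small sketch. The data layout and the function name `live_sources` are assumptions for the example; the registrations mirror the Fig. 3 arrangement, and the failure-handling logic shown is a simplification, not the mechanism of the Fault Tolerance section.

```python
# Registrations per client and per service, following the Fig. 3 arrangement.
REGISTRATIONS = {
    "L1A": {"RPB": ["L1B", "L1B'"]},
    "L1B'": {"RPA": ["L1A", "L1A'"]},
}


def live_sources(registrations, failed):
    """Return, per client and service, the servers that are still available."""
    return {
        client: {
            service: [s for s in servers if s not in failed]
            for service, servers in services.items()
        }
        for client, services in registrations.items()
    }


# With L1B failed, L1A can still obtain RPB route data from L1B'.
after_failure = live_sources(REGISTRATIONS, failed={"L1B"})
```

Because each client is registered with two servers per service, the failure of a single server leaves at least one source of route data for that service.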

Fig. 4 schematically represents the chronology of RTM-mediated data
flow between control cards 30A and 30B of router 10, according to one
embodiment of the invention. Only two control cards are depicted in Fig. 4;
however, it is to be understood that the principles of data flow could also apply
to a larger number of control cards. Control cards 30A and 30B run services A
and B, respectively. Each control card 30A and 30B also has an RTM task
running, RTM A and RTM B, respectively. The fact of each of the processors
running a service task dictates that RTM A and RTM B are both Level-1 as
defined hereinabove. Data flow is initiated by information injection from service
A to RTM A, as indicated by arrow 1. From RTM A, information is distributed
concurrently to both route table A and to RTM B, as indicated by the two
arrows each labeled 2. Thereafter, information is distributed from RTM B to
route table B, as indicated by arrow 3. Finally, information is received by service
B from route table B, arrow 4. Data flow of the type illustrated in Fig. 4 enables
the timely distribution of routing database updates between a plurality of control
cards within a distributed processor router, in which the plurality of control
cards are jointly responsible for running a plurality of different services.
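The chronology of arrows 1 to 4 can be traced with a minimal sketch. The class name `Rtm`, its methods, and the event log are hypothetical. The two arrow-2 steps are executed one after the other here, whereas the text describes them as concurrent.

```python
class Rtm:
    """Toy Level-1 RTM task that logs the numbered steps of the data flow."""

    def __init__(self, name, log):
        self.name = name
        self.route_table = []
        self.peers = []
        self.log = log

    def inject(self, route):
        # Arrow 1: the local service injects information into its RTM.
        self.log.append((1, f"service -> {self.name}"))
        # Arrows 2: distribute to the local route table and to peer RTMs.
        self.route_table.append(route)
        self.log.append((2, f"{self.name} -> route table"))
        for peer in self.peers:
            self.log.append((2, f"{self.name} -> {peer.name}"))
            peer.receive(route)

    def receive(self, route):
        # Arrow 3: the peer RTM stores the update in its own route table.
        self.route_table.append(route)
        self.log.append((3, f"{self.name} -> route table"))

    def read_routes(self):
        # Arrow 4: the local service reads from the local route table.
        self.log.append((4, f"route table -> service ({self.name})"))
        return list(self.route_table)


log = []
rtm_a = Rtm("RTM A", log)
rtm_b = Rtm("RTM B", log)
rtm_a.peers.append(rtm_b)
rtm_a.inject("10.0.0.0/8 via service A")
routes_seen_by_b = rtm_b.read_routes()
```

After the injection, both route tables hold the same entry, which is the synchronization property the data flow is meant to provide.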

By registration among L1s in the manner described herein,
information generated by the full complement of services of the system can
be effectively exchanged between L1s, with the result that each L1 maintains
a synchronized routing database. Scalability of the distribution of database
information among L1s is achieved by the formation of distribution trees
during the registration process.

According to the invention, each L1 task will maintain a
synchronized copy of the routing database. Each L1 task has the role of
constructing a synchronized forwarding table for use by L2 tasks, wherein
the forwarding table consists of a subset of the routing database.
Specifically, the routing database consists of all route entries, both best and
non-best as defined above, as well as interface information. Each L1 is able
to construct the forwarding table, based on the routing database, by
identifying the best route for each destination prefix.

In this manner, when a best route is deleted from the routing
database, each L1 can immediately replace the deleted "best route" with the
next best route in the forwarding table which matches the particular
destination prefix.
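Because non-best routes are retained, promoting the next best route after a deletion requires no new protocol exchange. The following sketch illustrates this under stated assumptions: the class `RoutingDatabase`, its methods, and the preference values are hypothetical.

```python
# Hypothetical preference values: a higher value is preferred.
PREFERENCE = {"static": 200, "isis": 100}


class RoutingDatabase:
    """Holds all entries per prefix, both best and non-best."""

    def __init__(self):
        self._entries = {}  # prefix -> list of (protocol, next_hop)

    def add(self, prefix, protocol, next_hop):
        self._entries.setdefault(prefix, []).append((protocol, next_hop))

    def delete(self, prefix, protocol):
        remaining = [e for e in self._entries.get(prefix, [])
                     if e[0] != protocol]
        if remaining:
            self._entries[prefix] = remaining
        else:
            self._entries.pop(prefix, None)

    def best(self, prefix):
        # The forwarding-table entry for this prefix, or None if no route.
        candidates = self._entries.get(prefix, [])
        if not candidates:
            return None
        return max(candidates, key=lambda e: PREFERENCE[e[0]])


db = RoutingDatabase()
db.add("10.0.0.0/8", "static", "if0")
db.add("10.0.0.0/8", "isis", "if1")
best_before = db.best("10.0.0.0/8")
db.delete("10.0.0.0/8", "static")
best_after = db.best("10.0.0.0/8")
```

Deleting the static entry causes the retained IS-IS entry to become the best route for the same prefix.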

An L2 task is an RTM task which is running on a processor outside
of the L1 pool. Each L2 requires a copy of the forwarding table. The
sources of forwarding table information are L1 tasks that are running
throughout the system.

The hierarchical relationship of RTM tasks, according to a preferred
embodiment of the invention, is schematically represented in Fig. 5. L1s
represent the highest level, or top layer, of the hierarchical relationship. As
described above, L1s are Level-1 RTM tasks which maintain a synchronized
copy of the routing database and are the source of the forwarding table,
whereas L2s are Level-2 RTM tasks which only maintain a copy of the
forwarding table. L2s themselves can occupy different hierarchical levels.
In order to distinguish between L2s which occupy different hierarchical
levels, L2 nodes which are clients of L1 servers as well as servers of L2
clients may be designated L2's, while L2s which are clients of L2' nodes may
be designated L2"s. Thus, immediately below the L1s, at the intermediate
hierarchical level or layer, lie L2s that are registered with L1s for forwarding
table information. Below the intermediate hierarchical level lie L2's which
are registered with an L2 node. Further, L2"s may be registered with L2's.
According to a preferred embodiment, the depth of the topology shown in
Fig. 5 is kept low by having a large fan-out at Layer 1. Again with reference
to Fig. 5, it should be noted that although only a single server is shown for
each client, according to a currently preferred embodiment of the invention
designed for fault tolerance, i.e., tolerance of the router system to failure of an
RTM task server, each client has at least two servers. In practice, for a
given L2" client (Layer 4), one server can be a Layer 1 server (L1), and the
other can be a Layer 2 node.

According to the invention, communication between RTM task
clients and RTM task servers takes place to form scalable and fault tolerant
distribution topologies. Among L1 tasks, distribution trees are formed for
the propagation of routing database information. An L1 task which is
running in association with a given service has the role of sourcing routing
database information generated by that service. Distinct distribution trees
therefore exist per service for the exchange of routing database information
among L1 tasks. In a similar manner, distribution trees for the propagation
of the forwarding table are formed with L1 tasks as the source of forwarding
table information and L2 tasks as the nodes and leaves.
The RTM interacts with a Location Service module to determine the
location of all RTM tasks running within router system 10. That is, the
Location Service (LS) functions as a directory service. Interactions of the
RTM with the LS include: (1) L1 RTM tasks, running on a control card 30,
query the LS to determine the location of any RTM tasks acting as the
source of routing database information for a particular service; (2) L2 RTM
tasks query the LS to determine the location of any L1 RTM tasks (sources
of forwarding table information); (3) LS notifies the RTM in the event that
an RTM task comes up or goes down; and (4) RTM tasks provide LS with
RTM task type (including the routing database source) and level information
to answer queries described in (1) through (3).
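
The four interactions listed above may be sketched as a small in-memory directory. This is an illustrative model only: the class, method, and field names are hypothetical, and the specification does not describe how the LS stores or transports this information.

```python
class LocationService:
    """Hypothetical in-memory model of the LS acting as a directory service."""

    def __init__(self):
        self.tasks = {}      # task_id -> {"level", "location", "source_service"}
        self.listeners = []  # callbacks invoked as callback(event, task_id)

    def subscribe(self, callback):
        # Interaction (3): the RTM asks to be told when tasks come up or go down.
        self.listeners.append(callback)

    def register(self, task_id, level, location, source_service=None):
        # Interaction (4): a task reports its type, level, and routing source.
        self.tasks[task_id] = {
            "level": level,
            "location": location,
            "source_service": source_service,
        }
        self._notify("up", task_id)

    def unregister(self, task_id):
        if self.tasks.pop(task_id, None) is not None:
            self._notify("down", task_id)

    def sources_for_service(self, service):
        # Interaction (1): which L1 tasks source routing data for this service?
        return sorted(
            task_id for task_id, info in self.tasks.items()
            if info["level"] == 1 and info["source_service"] == service
        )

    def l1_tasks(self):
        # Interaction (2): which L1 tasks can supply the forwarding table?
        return sorted(t for t, info in self.tasks.items() if info["level"] == 1)

    def _notify(self, event, task_id):
        for callback in self.listeners:
            callback(event, task_id)
```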

As described above, L1s are responsible for propagating the
forwarding database to the Level-2 tasks (L2s). This is accomplished by the
establishment of L1-L2 client-server relationships. L2 nodes register with
L1s for the forwarding table only (i.e., L2 nodes register for the forwarding
table "service"). According to one aspect of the invention, an L1 server will
accept N L2 clients, where N is determined, at least in part, by the
configured maximum fan-out. This situation is schematically represented in
Fig. 6, in which an L1 server (L1A) already has N L2 clients, represented by
L2A, L2B, and up to L2N. Client M represents an L2 that is not a client of
an RTM task running in the control plane of the router system. If client M
then signals a request to register with L1A (arrow 1), that request is denied
as represented by arrow 2. If maximum fan-out has been reached on all L1s
in the control plane, client M then requests registration (arrow 3) with an L2,
e.g. L2A, that is a client of an L1 (in this case L1A). A registration response
message is then sent from L2A to client M, as represented by arrow 4.
Client M can now receive forwarding table updates from L1A via L2A.
Maximum fan-out in L1-L2 client-server relationships is determined, inter
alia, by CPU load. In case maximum fan-out of all L2 servers has been
reached, then a client can force registration. This client-server registration
procedure is used to form distribution trees for the propagation of the
forwarding database among all L2 clients. Information on the location of the
servers is available from the LS. According to a currently preferred
embodiment, the LS itself runs on all control cards 30 and line cards 40 of
router system 10.
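
The registration sequence of Fig. 6 can be sketched as follows. The names `RtmNode` and `choose_server` are hypothetical. The specification states only that a client can force registration once all L2 servers are full; choosing the least loaded server for the forced registration is an assumption of this sketch.

```python
class RtmNode:
    """Hypothetical RTM task that may act as a server for other RTM tasks."""

    def __init__(self, name, max_fanout):
        self.name = name
        self.max_fanout = max_fanout  # configured maximum number of clients
        self.clients = []

    def register(self, client, force=False):
        """Accept the client unless maximum fan-out is reached (arrows 1 and 2)."""
        if len(self.clients) >= self.max_fanout and not force:
            return False
        self.clients.append(client)
        return True


def choose_server(client, l1_servers, l2_servers):
    """Try the L1s first, then the L2s that are clients of an L1 (arrows 3 and 4)."""
    for server in list(l1_servers) + list(l2_servers):
        if server.register(client):
            return server
    # Every server is at maximum fan-out, so the client forces registration.
    # Assumption: the least loaded L2 server (or L1, if no L2 exists) is chosen.
    candidates = list(l2_servers) or list(l1_servers)
    target = min(candidates, key=lambda s: len(s.clients))
    target.register(client, force=True)
    return target
```

Trying the L1 servers before any L2 server keeps the depth of the resulting distribution tree low, consistent with the large fan-out at Layer 1 described above.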

It will be apparent to the skilled artisan that the client-server
registration procedure described here is hierarchically based, in that L2s first
attempt to register with L1s until maximum fan-out has been reached, and
only then will an L2 attempt to register with an L2 that is registered as a
client of an L1. An L2 which acts as a server to an L2 client may be
designated L2', and an L2 client of an L2' server may be designated L2" (Fig.
5). Large scale distribution is therefore achieved by using a reliable multicast
transmission at the tree nodes. In general, the number of L2s is greater than
the number of L1s. According to one embodiment, the ratio of L1s to L2s
ranges from about 1:1 to about 1:15.

Fault Tolerance

Fault tolerance in the system of the invention, as alluded to briefly
above, is achieved by redundancy in registration, and therefore in
communication. As a client, an L1 or L2 task registers with at least two
servers from which it may receive the same information. One of the servers
with which the client registers is considered a primary server, and the other
a secondary. The client communicates exclusively with the primary unless
and until the primary fails in some manner to deliver, and then the client
turns to the secondary for database updates. Service is thus uninterrupted.

In the event of a server failure, and a necessary switchover by a
client to its secondary server, the client receives a copy of the secondary
server's database. If the client is a node in a distribution tree, it simply
delivers to its clients the difference between the existing database and the
copy of the database received from the secondary server.

Referring now to Fig. 7A, the role of a control card as a Level-2
node is to receive forwarding entries from its primary L1 server, and then to
redistribute the forwarding entries to its own clients, represented as L2
clients A, B, and C. The L2 node is registered with two L1 servers, the
primary L1 server and the secondary L1 server, for the purpose of fault
tolerance, as schematically represented in Fig. 7A.



Referring now to Fig. 7B, if the primary L1 server fails, the L2 node
activates its secondary L1 server. When the secondary L1 server is activated,
it delivers a complete copy of its database to the L2 node, as schematically
represented in Fig. 7B. When the L2 node receives the copy of the entire
table from the secondary L1 server, it compares that copy to its existing
database, and calculates the difference between the two. The L2 node only
needs to distribute to L2 clients A, B and C the difference between the entire
new table and its existing table.
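
The difference calculation performed by the L2 node after a switchover might look like the following sketch. Representing a forwarding table as a mapping from destination prefix to next hop, and the function names `table_diff` and `apply_diff`, are assumptions made for illustration.

```python
def table_diff(existing, received):
    """Return (upserts, removals) that turn `existing` into `received`.

    Both arguments map a destination prefix to a next hop (an assumed layout).
    """
    # Entries that are new, or whose next hop differs from the existing one.
    upserts = {
        prefix: hop for prefix, hop in received.items()
        if prefix not in existing or existing[prefix] != hop
    }
    # Entries the secondary server no longer carries.
    removals = sorted(prefix for prefix in existing if prefix not in received)
    return upserts, removals


def apply_diff(table, upserts, removals):
    """Apply a difference to a downstream client's copy of the table."""
    for prefix in removals:
        table.pop(prefix, None)
    table.update(upserts)
    return table
```

Only `upserts` and `removals` would need to be sent to L2 clients A, B and C; entries that are identical in both tables generate no traffic.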



Fig. 8 schematically represents a series of steps in a method for the
synchronized distribution of routing data within a distributed processor,
highly-scalable router, according to one embodiment of the invention. Step
800 of Fig. 8 involves running at least one routing protocol of a complement
of routing protocols on individual ones of a first plurality of processors,
wherein each routing protocol of the complement of routing protocols
generates routing data. This first plurality of processors are the L1
processors described in detail above. Also as previously described, it is the
configuration of the L1s to run routing protocols and to otherwise behave as
L1s that makes them L1s. An L1 may not be running a routing protocol, but
still be an L1. That is, an L1 may obtain all of its routing data from other
L1s with which it registers as a client.

Step 802 involves registering each of the first plurality of processors
with at least one other of the first plurality of processors. Step 804 involves
exchanging the routing data between members of the first plurality of
processors, such that each of the first plurality of processors receives a full
complement of routing data generated by the complement of routing
protocols. The complement of routing data received by each of the first
plurality of processors provides a complete routing database. Step 806
involves forming a forwarding database from the complete routing database
provided as a result of step 804. The forwarding database formed in step
806 is comprised of a subset of the complete routing database provided in
step 804.

Step 808 involves propagating the forwarding database from the first
plurality of processors to a second plurality of processors of the distributed
processor router, wherein the second plurality of processors are
characterized as not running (or being configured to run) routing protocols.
The method steps 800 through 808 may be sequentially repeated over time,
for example, when updated reachability information is received from one or
more peer routers of the distributed processor router.
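
Steps 804 through 808 can be summarized in a short end-to-end sketch. The function name `run_cycle`, the tuple layout of a route, and the lowest-preference-wins rule are assumptions made for illustration. The registration of step 802 and the actual message exchange are not modeled; the exchange is reduced to a simple union of the routing data generated on each L1.

```python
def run_cycle(generated_by_l1, l2_names):
    """One pass over steps 804 to 808 of Fig. 8, modeled in memory.

    generated_by_l1 maps an L1 name to the (prefix, preference, next_hop)
    routes generated by the protocols running on it (possibly none).
    """
    # Step 804: exchange routing data so that every L1 holds the full complement.
    complete = {}
    for routes in generated_by_l1.values():
        for prefix, preference, next_hop in routes:
            complete.setdefault(prefix, []).append((preference, next_hop))
    routing_dbs = {
        l1: {prefix: list(entries) for prefix, entries in complete.items()}
        for l1 in generated_by_l1
    }

    # Step 806: the forwarding database is the best-route subset.
    # Assumption: the lowest preference value marks the best route.
    forwarding = {prefix: min(entries)[1] for prefix, entries in complete.items()}

    # Step 808: propagate the forwarding database to each L2.
    l2_tables = {name: dict(forwarding) for name in l2_names}
    return routing_dbs, forwarding, l2_tables
```

In this sketch an L1 that runs no routing protocol supplies an empty route list and still ends the cycle with the complete routing database, mirroring the observation above that an L1 may obtain all of its routing data from other L1s.
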

General Applicability

The embodiments of the present invention described in enabling 
detail above have all been related to routing of data packets in
multi-processor, scalable routers. These embodiments are exemplary, and
represent a preferred application of the new technology described, but are 
not limiting in applicability of the invention. There are numerous other 
situations and systems in which the apparatus and methods of the invention 
may provide advantages. These situations include all situations in which 
multiple processors may be employed in parallel processing applications, 
wherein maintenance of one or more common databases is the object. The 
description of the present invention is intended to be illustrative, and not to 
limit the scope of the appended claims. Many alternatives, modifications, 
and variations will be apparent to those skilled in the art. 